Dysfunctional Sars-CoV-2-M protein-specific cytotoxic T lymphocytes in patients recovering from severe COVID-19

Although the importance of virus-specific cytotoxic T lymphocytes (CTL) in virus clearance is evident in COVID-19, the characteristics of virus-specific CTLs related to disease severity have not been fully explored. Here we show that the phenotype of virus-specific CTLs against immunoprevalent epitopes in COVID-19 convalescents might differ according to the course of the disease. We establish a cellular screening method that uses artificial antigen-presenting cells expressing HLA-A*24:02, the costimulatory molecule 4-1BBL, the SARS-CoV-2 structural proteins S, M, and N, and the non-structural proteins ORF3a and nsp6/ORF1a. The screen implicates SARS-CoV-2 M protein as a frequent target of IFNγ-secreting CD8+ T cells, and identifies M198–206 as an immunoprevalent epitope in our cohort of HLA-A*24:02-positive convalescent COVID-19 patients recovering from mild, moderate and severe disease. Further exploration of M198–206-specific CD8+ T cells with single-cell RNA sequencing reveals public TCRs in virus-specific CD8+ T cells, and shows an exhausted phenotype with less differentiated status in cells from the severe group compared to cells from the moderate group. In summary, this study describes a method to identify T cell epitopes and indicates that dysfunction of virus-specific CTLs might be an important determinant of clinical outcomes.

Profound depletion of T cells in peripheral blood was present, presumably due to overproduction of inflammatory cytokines 17. Thus, dysregulation of both innate and acquired immunity could be associated with COVID-19 severity. However, it is not fully understood how the differentiation status of antigen/virus-specific CTLs is related to disease severity. Since antigen/virus-specific CTLs are essential sentinels against invading pathogens, thorough characterization of these cells in COVID-19 patients is thought to be necessary to unveil the mechanisms of COVID-19 progression, as well as to inform therapeutic strategies.
In this work, we explore immunoprevalent epitopes from COVID-19 convalescents to examine SARS-CoV-2-specific CTLs. We perform an in-depth analysis of CTLs specific for a SARS-CoV-2 immunoprevalent epitope to search for features related to disease severity. The traits of dysfunction identified in cells from severe COVID-19 convalescents highlight impaired virus-specific CTL development as a possible determinant of clinical outcomes.

M protein is an immunocompetent viral protein
In order to achieve an in-depth analysis of SARS-CoV-2-specific CD8 + T cells, we generated CD8 + T cell libraries from peripheral blood cells of COVID-19 convalescents. These libraries are useful for the definitive characterization of low frequency, but potentially crucial, CD8 + T cell populations in peripheral blood because rare, virus-specific CD8 + T cells are polyclonally expanded followed by stimulation with relevant antigen-expressing aAPCs (Fig. 1a). COVID-19 convalescents enrolled in this study were admitted to, or consulted with a physician, at hospitals affiliated with Hyogo College of Medicine or Kyowa-kai Medical Corporation from late 2020. Convalescents had experienced mild, moderate, or severe COVID-19 (Supplementary Table 1). Disease severity of COVID-19 was classified according to a recent report 22 (see Methods). Since HLA-A * 24:02 was the dominant allele in our study group, we focused on HLA-A * 24:02 + subjects and a total of 36 COVID-19 convalescents and 9 healthy volunteers were enrolled in the study (Supplementary Table 1), and 20 convalescents and 8 healthy volunteers among them were subjected to the library assay. A small number (two thousand) CD45RO + CD8 + T cells isolated from each participant's PBMC were seeded into each well of 96-well-plates and polyclonally expanded with PHA in the presence of allogenic PBMC and cytokines to establish a library (Fig. 1a). All the established libraries from each participant were evenly divided into 7 groups (i.e., 5 groups for a series of SARS-CoV-2 proteins, 1 group for mixed peptides derived from influenza virus and cytomegalovirus, and 1 group for a non-antigen negative control) and incubated with relevant aAPCs individually. 
IFNγ in the culture supernatant was measured as a functional read-out for antigen-specific CD8+ T cells, because it is a major effector cytokine of various virus-specific human CD8+ T cells [23][24][25][26][27] and because IFNγ+ CD8+ T cells are strongly associated with low disease severity among acute cases of COVID-19 28. The response to the antigens was defined as positive when the IFNγ level was above the mean + 3 SD of that in the negative control (i.e., incubation with parental aAPCs). In total, 5403 libraries were examined in COVID-19 convalescents and healthy volunteers (3808 and 1595, respectively). Frequencies of IFNγ-positive libraries upon each antigen stimulation were compared among participant groups (i.e., COVID-19 convalescents and healthy volunteers). Minimal responses were observed in the non-antigen negative control groups of healthy and COVID-19 subjects; representative data are shown in Fig. 1b. Consistent with previous reports, responses were observed against various antigens, including structural and nonstructural proteins (Supplementary Figs. 2, 3). Of note, those included reported immunodominant epitopes: S 1208-1216 29 and ORF3a 112-120 30 (Supplementary Fig. 3b). Importantly, SARS-CoV-2 M protein induced higher responses in the libraries of convalescents than in those of healthy volunteers (p = 0.0019, Mann-Whitney U test) (Fig. 1c, d) (Fig. 2) in our cohort. Interestingly, the frequency was highest in the convalescents who had recovered from moderate COVID-19 (Fig. 1c, e). Although pre-existing cross-reactive immune memory to SARS-CoV-2 has been suggested 31, no significant responses against the proteins we tested were detectable in samples from healthy subjects of the cohort.
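The positivity rule described above (IFNγ above the mean + 3 SD of the negative control) can be illustrated with a minimal Python sketch. The function name, the use of the sample standard deviation, and the numeric readings are assumptions made for illustration only; the text does not state how the SD was computed, and the values are not data from this study.

```python
from statistics import mean, stdev


def positive_wells(control_ifng, test_ifng, n_sd=3.0):
    """Return indices of test wells whose IFN-gamma level exceeds
    mean + n_sd * SD of the negative-control wells.

    Assumption: the sample standard deviation (statistics.stdev) is used.
    """
    cutoff = mean(control_ifng) + n_sd * stdev(control_ifng)
    return [i for i, value in enumerate(test_ifng) if value > cutoff]


# Made-up readings (arbitrary units), for illustration only.
controls = [10, 12, 11, 9, 13]      # wells incubated with parental aAPCs
stimulated = [14, 16, 250, 15.5]    # wells incubated with antigen-expressing aAPCs

print(positive_wells(controls, stimulated))  # [1, 2]
```

With these made-up controls the cutoff is about 15.7, so only the second and third wells are called positive.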

Identification of an immunoprevalent CTL epitope M 198-206
To further examine the importance of M protein-specific CD8+ T cells in the context of SARS-CoV-2 infection, we sought to identify immunodominant epitopes of the M protein. The candidate M protein epitopes were obtained in silico as follows: candidate binders to HLA-A*24:02 were screened using the full-length M protein amino acid sequence (YP_009724393.1) and the Immune Epitope Database (https://www.iedb.org), and the top 5 peptides with the highest scores (Score > 0.5, Percentile rank < 0.2) and a typical amino acid length were chosen (Table 1). Once M protein-responding libraries were obtained, they were further expanded and divided into multiple wells to examine the response to each candidate peptide by coculturing with aAPCs pulsed with the individual peptide. Next, the supernatant was collected to measure IFNγ levels again. For each candidate peptide, the response was defined as positive when IFNγ levels were statistically higher than those of controls (i.e., stimulation with aAPCs alone) and the mean IFNγ level was more than twice that of controls. The majority of the libraries responded to only a single peptide, but some responded to two or more peptides, indicating the presence of two or more SARS-CoV-2 M-specific T cell clones among the 2000 original CD45RO+ CD8+ T cells. In total, 37 libraries from the 7 convalescents that responded to M protein-expressing aAPCs (aAPC-M) were tested for their antigen specificity. Surprisingly, 81.1% (30/37) of the libraries were found to respond to the M 198-206 peptide (Fig. 2a, b). Moreover, the response to M 198-206 was detected in the libraries from 6 out of 7 convalescents (Fig. 2b, c). In addition, in 5 out of 6 convalescents whose libraries responded to M 198-206, the frequency of M 198-206-responding libraries was more than half (Fig. 2c) activation induced marker (AIM) assay with the same set of M peptides.
As expected, the M 198-206 response was observed in the convalescents having M-responding libraries (CV-004 and CV-007, Supplementary Fig. 5), but not in the one having no responding library (CV-006, Supplementary Fig. 5). There was a minor difference in the peptide response between these assays. This might be due to the different populations identified by the assays; the library assay identified IFNγ+ CD8+ CD45RO+ cells, whereas the AIM assay identified the cells that became positive for selected activation markers after peptide stimulation.
Next, we further expanded the libraries obtained from moderate convalescents and performed the following downstream experiments, because the libraries from severe convalescents tended not to expand well and did not yield enough cells. To assess the effector features of M 198-206-specific CD8+ T cells, cytokine profiles were examined by intracellular cytokine staining (ICS). As shown in Fig. 3a, M 198-206-specific CD8+ T cells produced the typical inflammatory cytokines IFNγ and TNFα. libraries were subjected to the assay. As shown in Fig. 3b, antigen-specific cytotoxic activities were proportional to the effector/target (E/T) ratio. Furthermore, a TCR αβ pair cloned from an M 198-206-specific CD8+ T cell line was functional (Supplementary Table 2) (Supplementary Fig. 7). To examine direct effector functions of M 198-206-specific CD8+ T cells against SARS-CoV-2-infected lung epithelial cells, Calu-3 cells were infected with the SARS-CoV-2 Wuhan strain and cocultured with M 198-206-specific CD8+ T cells. Remarkably, as shown in Fig. 3c, M 198-206-specific CD8+ T cells suppressed not only intracellular viral RNA replication but also propagation of infectious virus in SARS-CoV-2-infected Calu-3 cells, identifying M 198-206 as a SARS-CoV-2 epitope of CTLs. To examine the immunological relevance of M 198-206-specific CD8+ T cells, we studied these cells in the peripheral blood of COVID-19 convalescents with different clinical severities without in vitro expansion. As expected, tetramer-positive CD8+ T cells were detected (Fig. 4a, b) with significantly higher frequency in peripheral blood of moderate or severe COVID-19 convalescents, which is consistent with the results from the libraries (Fig. 2d). In addition, we examined the frequency of virus-specific CD8+ T cells that were reported as prevalent in previous reports 30,32,33. As shown in Supplementary Fig. 8d, in our cohort, the frequency of M 198-206-specific CD8+ T cells was higher than that of ORF1b- or ORF3a-specific CD8+ T cells. Further, additional convalescents who suffered moderate/severe COVID-19 in early 2022 (CV-052, 057, 062, 065, 071, and 073) (Supplementary Table 1), when the Omicron strain spread in Japan, were examined. In fact, among the six convalescents from moderate/severe COVID-19, three were tested for Omicron and all of them were found to be positive. Five out of six moderate/severe COVID-19 convalescents harbored M 198-206-specific CD8+ T cells at a frequency similar to that in late 2020 (Fig. 4g). Importantly, M 198-206-specific CD8+ T cells were detected in the peripheral blood of COVID-19 convalescents for more than a year (Fig. 4f). We also found that an M 198-206-specific CTL line suppressed propagation of the Omicron strain (Supplementary Fig. 8c). Taken together, we concluded that M 198-206 is an immunoprevalent CTL epitope in our study cohort.

Phenotypes and signatures of M 198-206 -specific CTLs
By using the M 198-206 MHC tetramer, we addressed the question of whether the status of the virus-specific CD8+ T cells, such as differentiation, exhaustion, or senescence, reflects disease severity.
With flow cytometry, we found that the tetramer-positive CD8+ T cells in peripheral blood were significantly skewed toward an effector-memory (CCR7- CD45RA-) phenotype in the moderate group, but not in the severe group (Fig. 4c). The frequency of cells positive for the inhibitory receptor PD-1 among tetramer-positive CD8+ T cells was higher than that among total CD8+ T cells in the severe group, suggesting increased exhaustion of M 198-206-specific CTLs; this observation supports several recent reports (Fig. 4d) [34][35][36]. Interestingly, the frequency of cells positive for the senescence marker CD57 was significantly higher in tetramer-positive CD8+ T cells of patients in the moderate group, but not in those of patients in the severe group (Fig. 4e).
These observations were not due to the difference in the time point after the infection (Supplementary Fig. 8a). The gating strategy for these analyses is shown in Supplementary Fig. 9. Functionality of M 198-206-specific CD8+ T cells in the moderate group was confirmed by detecting the response to peptide stimulation; they secreted IFNγ and TNFα upon peptide stimulation, as observed in the libraries (Supplementary Fig. 8b). Next, we performed single-cell RNA-sequencing (scRNA-seq, 10X Genomics Platform) analysis on M 198-206-specific CD8+ T cells. A total of 18,222 tetramer-positive CD8+ T cells were isolated via fluorescence-activated cell sorting (FACS) from 10 PBMC samples derived from 6 convalescents (three moderate and three severe convalescents; in two moderate convalescents, samples from different time points after disease onset were included) (Supplementary Table 3). Each sample was individually stained with Hashtag antibodies, followed by FACS-based isolation, then mixed and subjected to sequencing. As a result, single-cell transcriptomic data were obtained from 4,452 single cells. Uniform manifold approximation and projection (UMAP), a bioinformatic dimension reduction algorithm, identified 11 clusters (cluster 0 to 10) (Fig. 5a). Since cluster 9 (C9) and C10 did not include a sufficient number of cells (<1%), these clusters were removed from further analysis. Differentially expressed genes (DEGs, one cluster vs. rest of the cells) of each cluster and selected featured genes are shown in Fig. 5b. Clusters could be roughly divided into two groups (group 1 and 2); group 1 includes C1, C2, C3, C4 and C7, and group 2 includes C0, C5, C6, and C8. Of note, group 1 highly expressed cytotoxic-effector genes including GZMB, GZMA, and TBX21, with some preferences in the expression of FCGR3A, PRF1, GNLY, CX3CR1, GZMH, or activation markers HLA-DR and CD38 etc. (Fig. 5c and Supplementary Fig. 10a).
Additionally, Gene set enrichment analysis with a consensus list of cytotoxicity signature genes 37 , also showed high score in group 1 (Fig. 5g), demonstrating that these are the cytotoxic-effector or memory cells (T cyto-eff/mem ). B3GAT1 (CD57) expression was also localized in group 1 clusters (Supplementary Fig. 10a). Furthermore, DEG analysis between group 1 and 2 showed that in group 1, several cytotoxicity markers were upregulated while some of the naïve markers (e.g., CCR7, TCF7, LEF1) were downregulated, demonstrating that the group 2 includes naïve/less differentiated cells (Fig. 5d-f, Supplementary Fig. 10a). In fact, naive markers including CCR7, CD62L, CD28, and CD27 were highly expressed in C8, suggesting that C8 is the cluster of naïve cells or memory stem cells (hereafter designated as 'T naive-like ') ( Fig. 5c, d and Supplementary Fig. 10a). The expression of naive markers was gradually decreased from C8 towards C5 (Fig. 5d), indicating the early differentiated status of C5. To elucidate the features distinguishing C0 and C6 from the rest of clusters, we explored marker genes (see "Methods"). Curiously, GZMK was identified as a selectively expressed gene in both C0 and C6, whose expression was slightly decreased toward C6 (Fig. 5c, d). Recent papers have reported GZMK as a marker of predysfunctional cells or precursor of exhausted T(T PEX ) cells, which have a distinct fate commitment to exhausted cells (T EX ) in human memory T cell pool 38 . In line with this, C0 highly expressed TCF7 and IL7R and intermediately expressed ZNF683, PDCD1 (Fig. 5d, Supplementary Fig. 10a) consistent with gene signature of T PEX cells. As expected, C6 highly expressed T EX markers including inhibitory receptors such as PDCD1 and TIGIT (Fig. 5d). Regarding other inhibitory receptors, CD244 was expressed in C6 (T EX ) as previously reported 39 (Supplementary Fig. 10a). 
Another inhibitory receptor, LAG3, was expressed in C6 (T EX) and C0 (T PEX) 40,41 (Supplementary Fig. 10a). Cytotoxic genes PRF1 and GZMB were highly expressed in C6 and moderately in C0, similar to the characteristics of T EX and T PEX cells 39,41 (Fig. 5d and Supplementary Fig. 10a). The expression patterns of transcription factor genes such as EOMES, TOX, TCF7, and TBX21 are also consistent with previously reported signatures of T EX and T PEX cells [41][42][43][44][45][46][47] (Fig. 5d and Supplementary Fig. 10a). These findings support the interpretation that C6 and C0 correspond to T EX and T PEX, respectively. Moreover, C6 showed a high score in the analysis with the exhaustion signature gene list employed in a recent work 37 (Fig. 5h). Trajectory analysis revealed the sequential distribution of the heterogeneous cell states along the C0-C6 axis towards C6 (Fig. 5i). Thus, we concluded that the trajectory from C0 to C6 represents a unique trajectory towards the exhaustion of SARS-CoV-2-specific CD8+ T cells. Although the number of convalescent subjects was small, cells from subjects with moderate disease tended to group into T cyto-eff/mem cells (group 1 clusters). In contrast, cells from subjects in the severe category tended to group into T EX cells (C6) (Fig. 5j). This observation supported the results from the flow cytometry analysis, where the severe group had higher PD-1 and lower CD57 expression (Fig. 4d, e). Furthermore, to confirm the increased exhaustion phenotype of M 198-206-specific CD8+ T cells in the severe group, we focused on TIGIT, an additional exhaustion marker for T cells, because C6 specifically co-expressed PDCD1 and TIGIT (Supplementary Fig. 10b). We compared the frequency of PD-1+ TIGIT+ cells between the moderate and severe groups through flow cytometry analysis of SARS-CoV-2 M 198-206-specific CD8+ T cells.
As expected by scRNA-seq results, the frequency of the cells was significantly higher in the cells from severe group compared with those from moderate group (Fig. 5k). Of note, time post viral clearance was not correlated with the frequency of PD-1 + TIGIT + cells (Supplementary Fig. 8e).

Identification of public TCRs of virus-specific CD8+ T cells
scRNA-seq analysis provided TCR sequences of the SARS-CoV-2 M 198-206-specific CD8+ T cells. TCR clonal expansion analysis revealed that in all the clusters except C8 (T naive-like) and C5, a majority of the clones expanded well (more than three cells observed, Fig. 6a), as expected from the signatures of those clusters (T cyto-eff/mem, T PEX, and T EX). Among the TCR sequences of the top 20 expanded clones, clone 14 from the CV-001 convalescent had the same sequences as the TCRα rank1 β rank1 clone isolated from CD8+ T cell libraries (Supplementary Table 2, Supplementary Table 5). Interestingly, clone 58 from the same convalescent had the same amino acid sequences of α and β chains as clone 14, with a difference of a single nucleotide in the α chain, suggesting that this TCR was advantageous in forming significant proportions of CTLs in this convalescent. Longitudinal analysis of the top 20 expanded clones from two convalescents (CV-001 and CV-004) detected long-lived memory CD8+ T cells that likely resided in C0, C2, and C4, suggesting that these clusters included memory cells (Fig. 6c).
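As an illustration of the expansion criterion mentioned above (a clone counted as expanded when more than three cells share its TCR), the following Python sketch tallies cells per clonotype. The function name and the clonotype labels are hypothetical and are not taken from the study's pipeline or data.

```python
from collections import Counter


def expanded_clones(cell_clonotypes, more_than=3):
    """Map each clonotype observed in more than `more_than` cells
    to its cell count.

    `cell_clonotypes` holds one clonotype label per sequenced cell.
    """
    counts = Counter(cell_clonotypes)
    return {clone: n for clone, n in counts.items() if n > more_than}


# Hypothetical per-cell clonotype labels, for illustration only.
cells = ["clone_A"] * 5 + ["clone_B"] * 3 + ["clone_C"] * 4 + ["clone_D"]

print(expanded_clones(cells))  # {'clone_A': 5, 'clone_C': 4}
```

Here `clone_B` is not counted because exactly three cells do not satisfy "more than three".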
Although the obtained TCR sequences showed the heterogeneity of the TCR clones, there were a few TCRs shared among different convalescents; they are shown as common TCRβ-1, -2, -3, and -4 in Fig. 6b and Table 2. Of note, the amino acid sequence of CDR3 in common TCRβ-1 was shared among all members of the moderate group of convalescents (CV-001, CV-003, and CV-004), with a single-nucleotide difference in all the subjects. Among TCR clones with common TCRβ-1, clone 151 and clone 893 had identical amino acid sequences of CDR3 in TCRα, while clone 91 had a similar but different amino acid sequence of CDR3 in TCRα, in which the amino acid at position 4 of CDR3 was different (Table 2). Moreover, the amino acid sequence of CDR3 in TCRβ-2 differed from that of TCRβ-1 only in the amino acid at position 8. Among TCR clones with TCRβ-1 or TCRβ-2, clone 29, clone 151 and clone 893 had identical amino acid sequences of CDR3 in TCRα, while clone 209 had a very close but different amino acid sequence of CDR3 in TCRα, in which the amino acid at position 4 of CDR3 was different from that of clones 151, 893, 91, and 29. Collectively, we found a public TCRαβ motif: "CAVXYNQGGKLIF" for the α motif and "CASSDSGXDGYTF" for the β motif. The identification of the public TCRs could also highlight the importance of recognition of the M 198-206 immunoprevalent epitope in SARS-CoV-2 clearance.
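The two motifs above can be used to screen other CDR3 sequences. The sketch below assumes that "X" stands for any single amino acid, which matches the variable positions described in the text (position 4 in the α chain, position 8 in the β chain). The helper names and the test sequences are hypothetical strings built for illustration, not sequences reported in Table 2.

```python
import re

ALPHA_MOTIF = "CAVXYNQGGKLIF"   # public TCR alpha CDR3 motif from the text
BETA_MOTIF = "CASSDSGXDGYTF"    # public TCR beta CDR3 motif from the text


def motif_to_regex(motif):
    """Compile a motif into a regex, treating 'X' as any one residue
    (assumption: X is a single-position wildcard)."""
    pattern = "".join("[A-Z]" if aa == "X" else re.escape(aa) for aa in motif)
    return re.compile(pattern)


def matches_motif(cdr3, motif):
    """True if the full CDR3 amino acid sequence fits the motif."""
    return motif_to_regex(motif).fullmatch(cdr3) is not None


# Hypothetical CDR3 strings, for illustration only.
print(matches_motif("CAVAYNQGGKLIF", ALPHA_MOTIF))  # True
print(matches_motif("CASSESGTDGYTF", BETA_MOTIF))   # False
```

Using `fullmatch` requires the whole CDR3 to fit the motif, so longer or shorter sequences that merely contain it are rejected.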

Discussion
Even at present, SARS-CoV-2 is spreading across the world, accumulating mutations, and causing COVID-19 with diverse clinical features. Especially in older populations, life-threatening outcomes are manifested; the age-specific infection fatality rate has been estimated to increase exponentially to 15% at age 85 1. Intensive efforts to tackle this issue are ongoing from an immunological point of view. Recent comprehensive analyses demonstrated that innate immunity (e.g., type I interferon) is a critical contributing factor to the course of COVID-19. In parallel with innate immunity, contributions of the adaptive immune system (e.g., cellular immunity) have been demonstrated 3,5. In this study, we focused on CTLs, a critical effector population for virus clearance, and performed a thorough characterization of virus-specific CTLs from COVID-19 convalescents with different severities, focusing on M 198-206, an immunoprevalent CTL epitope, to understand how the differentiation status of virus-specific CTLs is related to disease severity.
We screened T cell libraries as previously reported 18,48-50 by constructing several artificial antigen-presenting cells, each expressing an individual SARS-CoV-2 protein. The system was optimized to screen for virus-specific IFNγ+ CD8+ T cells. M protein was identified as an immunocompetent viral protein, and M 198-206 was the dominant epitope of the M protein. Compared to the recent papers reporting comprehensive analyses of peptide epitopes, in which many structural and non-structural protein epitopes were identified 22,51, our screening results were skewed toward the M protein. This discrepancy could come from the difference in the epitope preference of each HLA haplotype, since our study focused on HLA-A*24:02. Additionally, other papers identified cross-reactive T cells present prior to the COVID-19 pandemic in their cohorts 52,53, which we could not detect in our cohort. This could also influence the memory T cell pool in the convalescents and could affect the screening results accordingly. Another possibility is that the polyclonal expansion step in our system, driven by TCR stimulation, cytokines, etc., could favor specific memory or effector T cell clones, because theoretically there should be T cell subpopulations with different proliferative capacities within the population of CD45RO+ cells. Thus, we might have underestimated the antigen-specific responses. On the other hand, one of the strong points of the assay is the ease of downstream analysis, such as cytotoxicity assays, cytokine profiling by ICS, and tetramer staining. The recently developed activation-induced marker (AIM) assay and ELISpot assay have successfully identified antigen-specific cells and are employed by many researchers. These assays pick out, in a simple way, the antigen-specific cells primed by the epitope in vivo, which increases their accessibility and feasibility.
However, since these assays consume most of the input cells for analysis, it is difficult to perform downstream analyses that require sufficient numbers of cells. Above all, it should be noted that all these assays focus on different aspects of the cells; for example, the AIM assay focuses on activation marker-positive cells shortly after peptide stimulation, whereas the library assay in this study focuses on IFNγ-producing CD8+ T cells from the CD45RO+ population.
Importantly, M 198-206-specific CD8+ T cells were detected with significantly higher frequency in the peripheral blood of convalescents who had suffered moderate/severe COVID-19 from late 2020 through early 2022, consistent with reports demonstrating that M protein is a well-conserved viral protein among various SARS-CoV-2 strains (https://nextstrain.org/ncov/gisaid/global). In addition, the M 198-206-specific CD8+ T cell lines showed antigen-dependent IFNγ and TNFα production as well as cytotoxic activity, and suppressed propagation of SARS-CoV-2 strains: Wuhan and Omicron. Thus, we concluded that M 198-206 was one of the immunoprevalent CTL epitopes in our cohort. Previously, M 198-206 was examined as an HLA-A*30:01- or HLA-A*24:02-restricted CD8+ T cell epitope, but it has not been highlighted as a crucial viral epitope associated with COVID-19 severity 29,33,54.
Next, we performed in-depth analysis of CD8+ T cells specific for the immunoprevalent epitope M 198-206 to examine their characteristics. M 198-206-specific CD8+ T cells were detected at substantially high frequency in the moderate and severe groups, which enabled us to study their phenotypic differences by conventional flow cytometry analysis. We found that the exhaustion marker PD-1 was significantly higher in the severe group than in the moderate group. In contrast, the senescence/terminal differentiation marker CD57 was significantly lower in the severe group than in the moderate group. Additionally, tetramer-positive cells of the moderate group were highly skewed towards effector-memory cells, a skewing that was disturbed in the cells of the severe group.
We further took advantage of the high frequency of M 198-206-specific cells in the moderate and severe groups of our cohort and performed scRNA-seq analysis by isolating a total of 18,222 tetramer-positive cells from moderate and severe convalescents by FACS. This revealed a highly heterogeneous state of the virus-specific CD8+ T cells. In addition to cytotoxic-effector/memory populations, we found exhausted/pre-exhausted populations along a unique trajectory; the trajectory ran from GZMK-expressing progenitors of exhaustion/pre-exhausted cells to related exhausted cells, which had the highest signature score in gene set enrichment analysis, with expression of inhibitory receptors such as PDCD1 and TIGIT and limited cytotoxic gene expression. Surprisingly, the T PEX cluster accounted for the largest population on UMAP. It is of note that the cells from the severe group tended to accumulate in the T EX cell cluster, compared with those from the moderate group, which supported the results from the PD-1 and/or TIGIT staining experiments by flow cytometry analysis (Figs. 4c-e, 5k). As reported in chronic viral infection or cancer 55, functional loss of exhausted SARS-CoV-2-specific CTLs could result in a failure of proper elimination of the virus, which results in severe outcomes. There has been a debate regarding the exhaustion state of T cells in COVID-19. Recently, Dr. Shin's group beautifully showed that PD-1-expressing CD8+ T cells in COVID-19 were functionally active in terms of IFNγ production 56,57. This would not be contradictory, because the loss of IFNγ production occurs only in severely exhausted PD-1-expressing CD8+ T cells, as reviewed elsewhere 57,58.
Several papers have performed scRNA-seq on SARS-CoV-2-specific CD8+ T cells 37,59. Early studies successfully analyzed AIM+ cells as SARS-CoV-2-specific CD8+ T cells 37, but MHC tetramer technology would capture a different status of the cells because it can isolate the cells without stimulation. Recently, an excellent study was reported by Dr. Dong's group; NP 105-113-B*07:02-specific CD8+ T cells were extensively analyzed with an MHC-NP 105-113 peptide tetramer 51,59. NP 105-113-specific CD8+ T cells were frequently detected in the mild group, but less so in the severe group, and clonotypes with different functional avidity were detected in their cohort. Consistent with our results, scRNA-seq analysis showed high expression of granzyme K in NP 105-113-specific CD8+ T cells from the severe group 59. Thus, together with this report, our data strengthen the basic concept that CTLs are crucial immune effector cells that determine disease severity.
Single-cell TCR-sequencing analysis identified numerous clonotypes among the COVID-19 convalescents. Of note, several public TCRs were identified. These included public TCRαβ motifs shared in 3/3 of the moderate convalescents, which could further highlight the importance of M 198-206 epitope recognition in COVID-19. As expected, the transcriptomic status of the cells (i.e., distribution of the cells on UMAP) within and across the clonotypes was heterogeneous, suggesting that the fate of CTLs, such as cell division, differentiation, survival or cell death, was regulated by various factors, including the basal state of the cells, the amount and frequency of antigen stimulation or co-stimulation, and cytokines, during T cell activation in the clinical course of disease. Recent advanced platforms are accumulating information on public TCRs in infectious diseases and cancers [60][61][62]. In HIV-infected patients, there are rare subjects, called HIV controllers, who can control viral propagation without therapy 61. Public TCRs were reported to be crucial for control of HIV in those subjects; these TCRs showed high affinity to Gag293, which is the most immunoprevalent CD4 epitope in the HIV capsid 61. Although further analyses are required, the public TCRs identified in our experiment might contribute to recovery or lead to less severe disease. Since information on public clonotypes of SARS-CoV-2-specific T cells has been accumulating 63,64, our information on public clonotypes of M-specific CTLs could be important in considering further strategies against this unresolved disaster.
We analyzed the CTL phenotype in the context of a single M 198-206 epitope without including other epitopes or bystander T cells as controls for scRNA-seq analysis. Also, these analyses were performed with low numbers of moderate/severe COVID-19 convalescents. Additionally, this study focused on one HLA haplotype, HLA-A*24:02, because of its high frequency in our study cohort. These are limitations of this study; therefore, continuous accumulation of data from different cohorts with different immunocompetent epitopes is required. Furthermore, the deeper association of the exhausted phenotype and signature with SARS-CoV-2 M 198-206-specific CTLs from convalescents of severe disease raises the possibility that it is a consequence of the heightened immune activation associated with more severe disease. However, since we observed a less coordinated differentiation status of SARS-CoV-2 M 198-206-specific CTLs in the severe group compared to the moderate group (Fig. 4c), it could also be interpreted as a cause of severe COVID-19. To corroborate such an interpretation, detailed investigation of virus-specific CTLs in immunocompromised hosts with different severities is required.
In conclusion, we propose the trajectory towards exhaustion as a fate of SARS-CoV-2-specific CTLs leading to dysfunction in COVID-19. This could lead to poor outcomes, presumably due to an insufficient innate immune response in COVID-19 (e.g., type I IFN signaling). Moreover, M 198-206 could be highlighted as a crucial CTL epitope in determining COVID-19 severity. These results could provide a platform for understanding severe COVID-19 pathogenesis in relation to dysfunction of cellular immunity.

Study participants and ethics
COVID-19 convalescents were recruited from hospitals affiliated with Hyogo Medical University or Kyowakai Medical Corporation. COVID-19 convalescents were classified into three groups (i.e., mild, moderate, and severe) based on the extent of oxygen supplementation and the requirement for mechanical ventilation (mild: no oxygen supplementation; moderate: oxygen supplementation with FiO2 < 0.5; severe: heavy oxygen supplementation with FiO2 > 0.5 and/or mechanical ventilation) according to a recent report 22 Table 1. Ethical approval was given by the ethics committee of Hyogo College of Medicine (reference: 202104-144). Consent to publish the clinical characteristics of all the participants was obtained through written informed consent. Peripheral blood was drawn from convalescents who had recovered from COVID-19 of different severities or from healthy volunteers after written informed consent was given. PBMCs were isolated by Ficoll-Hypaque gradient centrifugation, and genomic DNA was purified using the QIAamp DNA blood mini kit (51104, Qiagen). All the subjects were tested by HLA-A DNA typing (GenoDive Pharma Inc.), and those positive for HLA-A*24:02 were subjected to the CD8+ T cell library assay and/or MHC tetramer staining etc. Our reporting of clinical data complies with the STROBE guidelines.
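The severity grouping above can be written as a small decision rule. The following Python sketch is an illustration under stated assumptions: the function and parameter names are hypothetical, and because the criteria as written do not cover FiO2 exactly equal to 0.5, the sketch raises an error for that value rather than guessing.

```python
def classify_severity(on_oxygen, fio2=None, mechanical_ventilation=False):
    """Assign 'mild', 'moderate', or 'severe' from the criteria in the text.

    mild:     no oxygen supplementation
    moderate: oxygen supplementation with FiO2 < 0.5
    severe:   FiO2 > 0.5 and/or mechanical ventilation
    """
    if mechanical_ventilation:
        return "severe"
    if not on_oxygen:
        return "mild"
    if fio2 is None:
        raise ValueError("FiO2 is required for patients on oxygen")
    if fio2 > 0.5:
        return "severe"
    if fio2 < 0.5:
        return "moderate"
    # Assumption: FiO2 == 0.5 is left undefined by the stated criteria.
    raise ValueError("FiO2 of exactly 0.5 is not covered by the stated criteria")


print(classify_severity(on_oxygen=False))              # mild
print(classify_severity(on_oxygen=True, fio2=0.3))     # moderate
print(classify_severity(on_oxygen=True, fio2=0.6))     # severe
```

Mechanical ventilation is checked first so that a ventilated patient is classified as severe regardless of the recorded FiO2.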

Cell lines
VeroE6/TMPRSS2 65 (JCRB 1819) cells were maintained at 37°C in 5% CO2 in Dulbecco's modified Eagle's medium (DMEM) (Thermo Fisher Scientific) supplemented with 10% heat-inactivated fetal bovine serum and 1 mg/ml G418. Calu-3 (ATCC HTB-55) cells, a human lung epithelial cell line, were maintained in Minimum Essential Medium (MEM) (Thermo Fisher Scientific) supplemented with 20% heat-inactivated fetal bovine serum. To establish an additional VeroE6 cell line expressing TMPRSS2, a vesicular stomatitis virus (VSV)-G-pseudotyped lentivirus carrying the human tmprss2 gene was produced using 293FT cells. VeroE6 (ATCC) cells infected with the pseudotyped virus were selected with 300 mg/ml hygromycin for at least 1 week. These bulk-selected cells were used for detecting SARS-CoV-2 viral RNA in supernatants from infected Calu-3 cells. TG40/CD8a cells were cultured at 37°C in 5% CO2 in RPMI medium (Wako) supplemented with 10% heat-inactivated fetal bovine serum.

CD8+ T cell library assay
The CD8+ T cell library assay was performed as previously described 18. Fresh cytokines were added every 3 days. On day 9, the libraries were screened for antigen specificity by coculturing with irradiated artificial antigen-presenting cells (aAPCs) expressing a series of viral proteins described above. aAPCs pulsed with CMV pp65 341-349 (QYDPVAALF) or Influenza PA 130-138 (YYLEKANKI) peptide (without expression of SARS-CoV-2 viral proteins) were also used. The culture supernatant was then harvested for IFNγ measurement by ELISA. In some experiments, CD45RO+CD8+ T cell libraries with positive responses were further expanded by adding the cytokine cocktail every 3 days for approximately 14 days and then restimulated with control aAPCs, SARS-CoV-2 viral protein-expressing aAPCs, or antigen peptide-pulsed aAPCs.

Flow cytometry and cell sorting
Flow cytometry was performed as previously described 18. For cell-surface labeling, CD8+ T cell libraries or PBMCs were stained with the antibodies for 30 min on ice. The cells were then analyzed on a BD LSRFortessa (BD Biosciences) with BD FACSDiva (V8.0) or a MACSQuant Analyzer (Miltenyi Biotec) with MACSQuantify (version 2.4), or sorted using a BD FACSAria II (BD Biosciences). Data analyses were performed with FlowJo (v10.4.2) (TreeStar). For intracellular cytokine staining, cells were stimulated with the indicated peptide for 2 hours and then further incubated in the presence of brefeldin A for 4 hours. After cell-surface staining, cells were fixed and permeabilized. Intracellular cytokines were detected with specific monoclonal antibodies using the FoxP3/transcription factor staining buffer set (00-5523-00, eBioscience) according to the manufacturer's instructions.

Cytotoxicity assay
M198–206-specific CD8+ T cells were enriched as effector cells as follows. M198–206-responding libraries from COVID-19 convalescents were further expanded and enriched with cytokines in the presence of irradiated (45 Gy) M-expressing aAPCs for a couple of weeks. The cells were then harvested, and M198–206-specific cells were purified with M198–206 MHC tetramer-PE and anti-PE microbeads (Miltenyi). The resulting cells were confirmed to be M198–206 tetramer+ with a purity of >95%. Next, in a 96-well plate, peptide-pulsed Calu-3 cells or unpulsed control cells were labeled with Calcein-AM (Dojin) as previously indicated 67. The effector cells were then added, or not added, to the target cells at the indicated E/T ratios. After 24 hours, the cells were extensively washed and intracellular calcein levels were measured using an Infinite M200 Pro fluorescence microplate reader (TECAN) (λEx = 490 nm, λEm = 520 nm). Each condition was assayed in triplicate wells, and the percentage of killing was calculated as ((OD of no-effector wells) − (OD of effector-added wells))/(OD of no-effector wells) × 100.
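The killing formula above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function name `percent_killing` and the example readings are hypothetical, and averaging the triplicate wells before taking the ratio is an assumption, since the text does not state how replicates were combined.

```python
from statistics import mean


def percent_killing(no_effector_wells, effector_wells):
    """Percent killing from calcein readings of replicate wells.

    Implements ((no-effector) - (effector-added)) / (no-effector) * 100.
    Assumption: replicate wells are averaged before the ratio is taken.
    """
    baseline = mean(no_effector_wells)
    if baseline <= 0:
        raise ValueError("no-effector signal must be positive")
    remaining = mean(effector_wells)
    return (baseline - remaining) / baseline * 100.0


# Hypothetical triplicate readings for one E/T ratio (arbitrary units).
control_wells = [1000.0, 1020.0, 980.0]
effector_wells = [400.0, 420.0, 380.0]
print(f"{percent_killing(control_wells, effector_wells):.1f}% killing")
```

A negative result would indicate that the effector-added wells retained more calcein than the controls, which in practice usually reflects well-to-well noise rather than true protection.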

Preparation of SARS-CoV-2 virus stock
The SARS-CoV-2 isolate (UT-NCGM02/Human/2020/Tokyo) 68 and the Omicron isolate (BA.1 lineage, TY38-873) from the National Institute of Infectious Diseases, Japan were propagated in VeroE6/TMPRSS2 (JCRB 1819) cells in DMEM containing 5% heat-inactivated fetal bovine serum at 37°C in 5% CO2. Briefly, SARS-CoV-2 was added at a multiplicity of infection (MOI) of 0.01 to VeroE6/TMPRSS2 (JCRB 1819) cells and incubated for 30 min at 37°C. The culture medium was then replaced with fresh medium, and the cells were incubated for an additional 48 hours. The supernatant was centrifuged at 800×g for 5 minutes to remove cell debris and stored as virus stock at −80°C. The virus titer was determined by plaque assay using VeroE6/TMPRSS2 (JCRB 1819) cells.

SARS-CoV-2 infection assay
Calu-3 cells were seeded at 2 × 10^4 cells per well in a 96-well cell culture plate. The following day, cells were infected with SARS-CoV-2 for 30 min at an MOI of 0.1 for the Wuhan strain and an MOI of 1 for the Omicron strain. Cells were washed with PBS and incubated in fresh medium for 24 hours. Medium with or without 1 × 10^5 or 2 × 10^5 CTLs was then added to the cells, and the cells were incubated for an additional 24 hours. Supernatants were collected and stored at −80°C after cell debris was removed by centrifugation at 800×g for 5 min. To measure the amount of viral RNA amplified in Calu-3 cells, the cells were washed three times with PBS, and cell lysis and cDNA synthesis were performed using the SuperPrep II Cell Lysis & RT Kit for qPCR (TOYOBO) according to the manufacturer's instructions. To measure the amount of infectious viral particles released from infected Calu-3 cells, 10 µl of the supernatant was incubated for 24 h with VeroE6/TMPRSS2 (ATCC) cells seeded in a 96-well cell culture plate. After being washed three times with PBS, the cells were lysed and cDNA was synthesized using the SuperPrep II Cell Lysis & RT Kit for qPCR (TOYOBO) according to the manufacturer's instructions.
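As a worked illustration of the MOI values used in the infection step, the volume of virus stock per well follows from MOI × cell number divided by the stock titer. The sketch below is hypothetical throughout: the function name `inoculum_volume_ul` and the titer of 1 × 10^6 PFU/ml are invented for illustration (the stock titers are not reported here), and it assumes that the cell number at infection equals the seeded number.

```python
def inoculum_volume_ul(moi, cells_per_well, titer_pfu_per_ml):
    """Volume of virus stock (in microliters) needed per well.

    MOI is the number of infectious units (PFU) added per cell, so the
    required PFU is moi * cells_per_well.
    """
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    pfu_needed = moi * cells_per_well
    return pfu_needed / titer_pfu_per_ml * 1000.0  # ml -> µl


SEEDED_CELLS = 2e4          # cells per well, as stated in the text
HYPOTHETICAL_TITER = 1e6    # PFU/ml, an assumed value, not from the paper

for strain, moi in (("Wuhan", 0.1), ("Omicron", 1.0)):
    volume = inoculum_volume_ul(moi, SEEDED_CELLS, HYPOTHETICAL_TITER)
    print(f"{strain}: MOI {moi} -> {volume:.1f} µl of stock per well")
```

Because Calu-3 cells may divide or detach between seeding and infection, an actual protocol would typically count cells from a parallel well rather than rely on the seeded number.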

Peptide competition assay
Direct association of the M198–206 peptide with HLA-A*24:02 was examined using the components of the QuickSwitch Quant Tetramer Kit-PE (TB-7302-K1) according to the manufacturer's instructions. Briefly, HLA-ABC Magnetic Capture Beads were mixed with or without QuickSwitch Tetramer or M198–206 tetramer. The beads were then rinsed and stained with FITC-labeled Exiting Peptide antibody. The beads were rinsed again and subjected to flow cytometry analysis using a BD LSRFortessa (BD Biosciences). The mean fluorescence intensity (MFI) of the FITC channel was calculated using FlowJo (v10.4.2) (TreeStar). The frequency of Exiting Peptide was calculated as follows: the no-tetramer control and the QuickSwitch Tetramer control were set to 0% and 100% Exiting Peptide, respectively. A linear curve was then generated by plotting the MFIs obtained with the two controls against percent Exiting Peptide, and the MFI of the M198–206 tetramer was used to calculate the percentage of peptide exchange.
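The two-point calibration described above amounts to a linear interpolation between the control MFIs. The following Python sketch illustrates the arithmetic under stated assumptions: the function names and MFI values are hypothetical, and the percentage of peptide exchange is taken to be 100 minus the percent Exiting Peptide remaining on the tetramer, which the text implies but does not state explicitly.

```python
def percent_exiting_peptide(mfi_sample, mfi_no_tetramer, mfi_quickswitch):
    """Interpolate percent Exiting Peptide from a two-point linear curve.

    The no-tetramer control defines 0% and the unexchanged QuickSwitch
    Tetramer control defines 100%.
    """
    span = mfi_quickswitch - mfi_no_tetramer
    if span == 0:
        raise ValueError("the two controls must have different MFIs")
    return (mfi_sample - mfi_no_tetramer) / span * 100.0


def percent_peptide_exchange(mfi_sample, mfi_no_tetramer, mfi_quickswitch):
    """Assumption: exchange (%) = 100 - remaining Exiting Peptide (%)."""
    remaining = percent_exiting_peptide(
        mfi_sample, mfi_no_tetramer, mfi_quickswitch
    )
    return 100.0 - remaining


# Hypothetical FITC MFIs for the two controls and the exchanged tetramer.
exchange = percent_peptide_exchange(
    mfi_sample=250.0, mfi_no_tetramer=50.0, mfi_quickswitch=1050.0
)
print(f"peptide exchange: {exchange:.1f}%")
```

With only two calibration points the curve is necessarily a straight line, so any nonlinearity in the antibody staining would not be captured by this calculation.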

TCR analysis
For the repertoire analysis, total RNA from the SARS-CoV-2-M198–206-specific cell line was purified and subjected to next-generation sequencing (Repertoire Genesys Inc., Osaka, Japan). cDNAs of the TCR alpha and beta chains were linked by a T2A sequence and subcloned into the pMX-IRES-GFP vector. Ecotropic 293T cells were used as packaging cells, and the resulting retroviral supernatant was collected. Viral transduction of the genes into TG40/CD8 cells was performed as previously described with minor modifications 67,69.

Single-cell transcriptome analysis
Single-cell libraries were prepared with reagents and instructions from 10x Genomics. cDNA was amplified for 14 cycles, and up to 50 ng of cDNA was used for gene expression libraries. Doublets were removed using Scrublet 70. The top 4000 highly variable genes were selected and used for clustering. Further data analysis was done with the BBrowser platform (version 3.3.6, Bio Turing). Cytotoxicity signature and exhaustion signature scores were generated using published lists of genes 37 (Supplementary Table 4) with BBrowser 71. Dimensionality reduction was done by UMAP (uwot package: https://github.com/jlmelville/uwot), with the number of neighbors set to 30. Louvain clustering on the PCA results was run with the igraph package 72 using a flexible number of nearest neighbors. To detect marker genes, the nonparametric Venice method was used 73. Venice was also used for differential expression analysis between two groups. For trajectory analysis, the cell embeddings on UMAP were fed to the monocle3 algorithm to obtain the graph structure 74.

AIM assay
The AIM assay was performed as previously described 20. PBMCs were cultured for 24 hours in the presence of peptide (10 µg/ml) or DMSO in 96-well U-bottom plates at 1 × 10^6 cells/well. CD69+CD137+ cells in the CD8+ T-cell population were detected as AIM+ by flow cytometry.

Statistical analysis
Comparisons were made with the indicated statistical tests using GraphPad software (version 7.02). Unless otherwise indicated, Mann-Whitney or Wilcoxon tests were applied for unpaired or paired comparisons, respectively. For library studies, a positivity threshold was calculated for each subject as the mean + 3 SD of the IFNγ levels in wells cultured with aAPCs not expressing SARS-CoV-2 viral proteins, and wells exceeding this threshold were considered positive. The percentage of positive library wells is presented as: (number of positive wells/total number of wells) × 100.
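The positivity call for library wells can be illustrated with a short Python sketch. The function names and IFNγ values below are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) rather than the population standard deviation is an assumption, since the text does not specify which was used.

```python
from statistics import mean, stdev


def positivity_threshold(control_ifng):
    """Mean + 3 SD of IFNγ in wells cultured with control aAPCs.

    Assumption: SD is the sample standard deviation (n - 1 denominator).
    """
    return mean(control_ifng) + 3 * stdev(control_ifng)


def percent_positive_wells(library_ifng, control_ifng):
    """(number of positive wells / total number of wells) * 100."""
    if not library_ifng:
        raise ValueError("at least one library well is required")
    threshold = positivity_threshold(control_ifng)
    positives = sum(1 for value in library_ifng if value > threshold)
    return positives / len(library_ifng) * 100.0


# Hypothetical IFNγ readings (pg/ml) for one subject.
control_wells = [10.0, 12.0, 8.0, 10.0]
library_wells = [5.0, 15.0, 200.0, 14.0, 30.0, 9.0, 11.0, 12.0]
print(f"threshold: {positivity_threshold(control_wells):.2f} pg/ml")
print(f"positive wells: {percent_positive_wells(library_wells, control_wells):.1f}%")
```

Computing the threshold separately for each subject, as described in the text, keeps the call robust to between-subject differences in background IFNγ secretion.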

Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability
The Immune Epitope Database is accessible online (https://www.iedb.org/). The scRNA-seq data of SARS-CoV-2-M198–206-specific CD8+ T cells generated in this study have been deposited in the Gene Expression Omnibus under accession code GSE209676. The remaining data are available within the paper and the Source Data file. Source data are provided with this paper.